Taxonomic assignment report

Department of Agriculture, Fisheries and Forestry logo
Facility Hogwarts
Analyst John Doe
Analysis started 2025-04-06 06:25:08
Analysis completed 2025-04-06 06:25:08
Wall time 0:0:0 hours

Sample details

locus 16S rRNA
preliminary_id Bacteria
taxa_of_interest
country Australia
host potato (Solanum tuberosum)
sample_id Bac2_16S
Query DNA sequence

Result overview

Inconclusive
The analyst should attempt subjective species identification at the genus level.
Reasoning - Flag 1C: >3 candidate species matched with high stringency (identity ≥ 98.5%).

Preliminary morphology ID confirmed NA

Inconclusive taxonomic identity (Flag 1C)

No taxa of interest provided at rank genus/species

Analyst evaluation

Identification of candidate species

Flag 1C: The analyst should attempt subjective species identification at the genus level
>3 candidate species matched with high stringency (identity ≥ 98.5%)

Candidate hits must meet ONE of these criteria:

Minimum alignment length 400bp
Minimum query coverage 85.0%

Candidate hits are then classified as follows:

Classification    Alignment identity    Number of hits    Number of species
STRONG MATCH      ≥ 98.5%               500               20
MODERATE MATCH    ≥ 93.5%               NA                NA
NO MATCH          < 93.5%
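
The filtering and classification rules above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the `Hit` record and the function names are hypothetical, and it assumes that MODERATE MATCH covers identities from 93.5% up to (but not including) 98.5%. Only the four thresholds and the Flag 1C condition (>3 species with a strong match) come from the report.

```python
from dataclasses import dataclass

# Thresholds as stated in the report.
MIN_ALIGNMENT_LENGTH = 400   # bp
MIN_QUERY_COVERAGE = 85.0    # percent
STRONG_IDENTITY = 98.5       # percent
MODERATE_IDENTITY = 93.5     # percent


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit."""
    species: str
    identity: float         # alignment identity, percent
    alignment_length: int   # bp
    query_coverage: float   # percent


def is_candidate(hit: Hit) -> bool:
    """A hit qualifies if it meets at least ONE of the two criteria."""
    return (hit.alignment_length >= MIN_ALIGNMENT_LENGTH
            or hit.query_coverage >= MIN_QUERY_COVERAGE)


def classify(hit: Hit) -> str:
    """Map alignment identity onto the report's three classes."""
    if hit.identity >= STRONG_IDENTITY:
        return "STRONG MATCH"
    if hit.identity >= MODERATE_IDENTITY:
        return "MODERATE MATCH"
    return "NO MATCH"


def flag_1c(hits: list[Hit]) -> bool:
    """True when more than 3 distinct species have a strong match."""
    strong_species = {
        h.species for h in hits
        if is_candidate(h) and classify(h) == "STRONG MATCH"
    }
    return len(strong_species) > 3
```

In this report, 20 species fall into the STRONG MATCH class, so a check like `flag_1c` would return `True`, which is consistent with the Inconclusive outcome.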

Hits per candidate species (top 10 candidates only)

Species                       Hits    Identity    E-value
Pectobacterium brasiliense    215     99.8%       0.0
Pectobacterium aroidearum     2       99.7%       0.0
uncultured Erwinia sp.        8       99.7%       0.0
Pectobacterium versatile      46      99.7%       0.0
uncultured bacterium          5       99.7%       0.0
Pectobacterium carotovorum    160     99.7%       0.0
Pectobacterium sp.            17      99.7%       0.0
bacterium                     7       99.6%       0.0
Pectobacterium sp. 21LCBS03   1       99.4%       0.0
Pectobacterium sp. CW5        1       99.2%       0.0
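
A per-species summary like the one above could be produced roughly as follows. The report does not say how the Identity and E-value columns are aggregated, so this sketch assumes the highest identity and the lowest E-value among each species' hits, with rows ordered by identity. The function name and the input tuple layout are hypothetical.

```python
from collections import defaultdict


def summarise_by_species(hits, top_n=10):
    """Summarise (species, identity, evalue) tuples per species.

    Assumption: identity is reported as the best (maximum) value and
    E-value as the best (minimum) value across a species' hits.
    """
    grouped = defaultdict(list)
    for species, identity, evalue in hits:
        grouped[species].append((identity, evalue))

    rows = []
    for species, values in grouped.items():
        rows.append({
            "species": species,
            "hits": len(values),
            "identity": max(identity for identity, _ in values),
            "evalue": min(evalue for _, evalue in values),
        })

    # Highest identity first; ties broken alphabetically for stable output.
    rows.sort(key=lambda row: (-row["identity"], row["species"]))
    return rows[:top_n]
```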

Boxplot of BLAST hit identity percent grouped by genus

The boxplot above shows the identity (%) of BLAST hits grouped by genus. Each data point shows the alignment identity between the query and matched reference sequence. The analyst may wish to use this to make a subjective genus-level identification for the sample.
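
For an analyst who wants the numbers behind such a boxplot, the sketch below computes a five-number summary of hit identity per genus. It assumes each hit already carries a genus label (the input layout and function name are hypothetical) and uses the inclusive quartile method from Python's `statistics` module, which may differ from the method used to draw the report's plot.

```python
from collections import defaultdict
from statistics import median, quantiles


def identity_summary_by_genus(hits):
    """Five-number summary of identity (%) for (genus, identity) pairs."""
    by_genus = defaultdict(list)
    for genus, identity in hits:
        by_genus[genus].append(identity)

    summary = {}
    for genus, values in by_genus.items():
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            # quantiles() needs at least two points; a single hit has no spread.
            q1 = q3 = values[0]
        summary[genus] = {
            "n": len(values),
            "min": min(values),
            "q1": q1,
            "median": median(values),
            "q3": q3,
            "max": max(values),
        }
    return summary
```

A genus whose box sits clearly above all others supports a genus-level call; heavily overlapping boxes do not.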

Analyst evaluation

Taxa of interest

This section shows the taxa of interest (TOI) specified by the submitter. Each of these taxa has been cross-referenced against the candidate species to determine whether it might match the taxonomic identity of the sample. A blank row indicates a TOI that did not match any candidate species, meaning that the sample is unlikely to match that TOI.

See the Database coverage section for the database coverage of taxa of interest.

Reference sequence source diversity

This analysis evaluates how many independent sources have contributed reference sequences for each candidate species. This provides a measure of confidence in the taxonomic annotation of the reference sequences. A species whose reference sequences have been annotated by multiple independent sources is more likely to have a correct taxonomic annotation.
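
The counting step can be sketched as below. The report does not define what constitutes an "independent source", so the `source_id` here is a hypothetical label (for example, a submitting group or publication) that the caller must derive from the reference records.

```python
from collections import defaultdict


def count_independent_sources(records):
    """Count distinct sources per species from (species, source_id) pairs.

    `source_id` is a hypothetical identifier; how independence is decided
    is an assumption left to the caller.
    """
    sources = defaultdict(set)
    for species, source_id in records:
        sources[species].add(source_id)
    return {species: len(ids) for species, ids in sources.items()}
```

A species backed by a single source deserves more scrutiny than one whose annotations come from several unrelated submitters.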

Analyst evaluation

No candidate species to report on.

Database coverage of target taxa

The target taxa include candidate species, the preliminary morphology ID, and any taxa of interest provided by the submitter. Each of these taxa is independently evaluated against the reference database to determine whether sufficient reference data exists to support identification of the target taxon. Insufficient coverage of a taxon can result in that taxon not being correctly identified as the taxonomic identity of the sample. For example, if the sample is Homo sapiens, but Homo sapiens sequences are not included in the reference database, the analysis will be unable to identify Homo sapiens as the correct taxonomic identity, and will most likely assign the closest relative with reference data as the taxonomic identity.

Analyst evaluation

Database coverage

Flag 5.1C: The reference database is likely to be unreliable for this species
Reasoning: The given locus for this taxon is not present in reference database (0 entries)


Database coverage of Bacteria

Flag 5.1C: The reference database is likely to be unreliable for this species
Reasoning: The given locus for this taxon is not present in reference database (0 entries)

0 records

There are 0 sequences in the reference database for Bacteria at the given locus 16S rRNA.

Map of database coverage

Global occurrence records for Bacteria.
Note that the occurrence data are not exhaustive, and it is possible for this species to occur in regions not shown on the map.


Database coverage of species in genus Bacteria

Flag 5.2C: The database has poor support for species in this genus
Reasoning: ≤10% of taxon have reference sequence(s) at the given locus

/ (%) sequence records were found in the reference database for:

  • Species in the genus Bacteria
  • At the target locus 16S rRNA
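
The coverage figure behind Flag 5.2C is a simple proportion, sketched below under stated assumptions. The 10% threshold comes from the flag's reasoning text; the function names are hypothetical, and returning `None` when the genus has no known species is this sketch's own choice for an undefined ratio, not a documented behaviour of the workflow.

```python
def coverage_percent(species_with_refs, total_species):
    """Percent of species in the genus with reference sequences at the locus.

    Returns None when the genus has no known species, because the ratio
    is undefined in that case.
    """
    if total_species == 0:
        return None
    return 100.0 * species_with_refs / total_species


def poor_genus_support(species_with_refs, total_species, threshold=10.0):
    """True if coverage is at or below the threshold; None if undefined."""
    pct = coverage_percent(species_with_refs, total_species)
    if pct is None:
        return None
    return pct <= threshold
```

For example, `poor_genus_support(1, 20)` evaluates 5% coverage against the 10% threshold and returns `True`.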

Number of GenBank records at locus 16S rRNA


Database coverage of species in genus Bacteria that occur in country of origin

Flag 5.3C: Probability that a different related species from the country of origin is the true taxonomic identity: LOW
Reasoning: No species in genus have been observed in the country of origin

/ (%) sequence records were found in the reference database for:

  • Species in the genus Bacteria
  • At the target locus 16S rRNA
  • In the sample country of origin Australia

Number of GenBank records at locus 16S rRNA

Intraspecies diversity

This section provides a phylogeny of the candidate reference sequences. The analyst can use this to make a subjective observation on how well the reference sequences are able to distinguish between species. If the phylogeny shows distinct clades for each species, we can be confident that the molecular data are capable of distinguishing between those species. However, if the phylogeny shows overlap between species, this reduces the capacity of the molecular data to confidently distinguish between those species. In some cases, we may see the query sequence falling outside of the adjacent species' clades, which indicates that our query species is not represented in the reference database, which could indicate a rare or novel species.

Analyst evaluation


The phylogenetic tree was constructed with FastME using the Neighbor-Joining method. Multiple-sequence alignment of the candidate reference sequences was performed using MAFFT. The visualization is rendered with TidyTree.
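
Neighbor-Joining builds a tree from pairwise distances between aligned sequences. The sketch below shows one simple way to obtain such distances, the uncorrected p-distance (the proportion of differing sites). It is an illustration of the idea only: it is not FastME's implementation, the substitution model used by the workflow is not stated in the report, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned):
    """Symmetric distance matrix for a {name: aligned_sequence} mapping."""
    names = list(aligned)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(aligned[a], aligned[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix
```

When 16S sequences from different species differ at only a handful of sites, these distances are close to zero and the resulting clades overlap, which is the situation the Flag 1C outcome reflects.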

Taxonomy Check

The following resources can be used to ensure that the given taxonomy is legitimate and current.

Taxa                                                 Database
General                                              GBIF
General                                              ITIS
Mealybugs & scale                                    ScaleNet database
Thrips                                               Thripswiki
Spider Mites                                         Spider Mites Database
Psocodea (Barklice, Booklice, and Parasitic Lice)    Psocodea Species File Online
Orthoptera                                           Orthoptera Species File Online
Drosophilidae                                        TaxoDros
Diptera                                              Catalog of the Diptera of the Australasian and Oceanian Regions
                                                     Systema Dipterorum
Aphids                                               Aphid Species File
Ants                                                 AntWeb
                                                     AntCat
Lepidoptera (butterflies and moths)                  The Global Lepidoptera Names Index
Gracillariidae (primitive moths)                     Global Taxonomic Database of Gracillariidae
Pyralidae (pyralid moths)                            Global Information System on Pyraloidea
Tortricidae (tortrix moths)                          Tortricidae Resources on the Net